[AArch64][SVE] Remove isSVECC() in favour of changing the calling convention #152742
Conversation
[AArch64][SVE] Remove isSVECC() in favour of changing the calling convention

We essentially had two different ways of setting the calling convention for a function: the proper calling convention and a target flag to override the calling convention. Removing the flag in favour of setting the calling convention is an NFC except in some (somewhat unsupported) cases. The cases that needed changes were:

1. Removing the "SVE_VectorCall is unsupported on Darwin" failure
   - The previous method of setting the SVE calling convention means that it has been possible to use the SVE CC on Darwin for a while now (and some tests depend on that).
   - It is unclear if this really should be allowed or not (but it does pose a question for SME functions on Darwin that have SVE params).
2. Removing a Windows SVE test that used varargs and SVE arguments
   - This is not supported (and already fails in LowerCall).
3. Updating the checks for a Windows + exceptions SVE test

Note: The updated logic intends to closely match the code in `AArch64TargetLowering::LowerCall` (which also changes the CC).
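The deferral trick the patch relies on, collecting a flag while arguments are analyzed and only upgrading the calling convention once, on scope exit, can be sketched in plain C++. This is a minimal stand-in, not LLVM's actual types: `ScopeExit`, `Function`, `Arg`, and `lowerArgs` here are illustrative names, and the real code uses `llvm::make_scope_exit` from `llvm/ADT/ScopeExit.h`.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Minimal stand-in for llvm::make_scope_exit: runs a callback when the
// guard object goes out of scope.
template <typename Fn> struct ScopeExit {
  Fn Callback;
  ~ScopeExit() { Callback(); }
};
template <typename Fn> ScopeExit<Fn> makeScopeExit(Fn F) {
  return ScopeExit<Fn>{std::move(F)};
}

enum class CallingConv { C, Fast, SVE_VectorCall, Other };

// Hypothetical argument descriptor: only tracks whether it is scalable (SVE).
struct Arg { bool IsScalable; };
struct Function { CallingConv CC; };

// Sketch of the LowerFormalArguments flow: the flag may be set at several
// points during argument analysis, but the CC is upgraded exactly once, at
// exit, and only for the C and Fast conventions.
void lowerArgs(Function &F, const std::vector<Arg> &Args) {
  bool ShouldUpgradeToSVECC = false;
  auto Defer = makeScopeExit([&] {
    if ((F.CC == CallingConv::C || F.CC == CallingConv::Fast) &&
        ShouldUpgradeToSVECC)
      F.CC = CallingConv::SVE_VectorCall; // the one place the CC changes
  });
  for (const Arg &A : Args)
    if (A.IsScalable)
      ShouldUpgradeToSVECC = true; // mirrors the per-register-class checks
}
```

The scope-exit guard keeps the upgrade decision in one place even though the diff sets the flag in three separate register-class branches.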
@llvm/pr-subscribers-backend-aarch64

Author: Benjamin Maxwell (MacDue)

Full diff: https://github.com/llvm/llvm-project/pull/152742.diff (7 Files Affected)
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index c52487ab8a79a..f7080465b4880 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -1356,8 +1356,7 @@ void AArch64AsmPrinter::emitFunctionEntryLabel() {
if (TT.isOSBinFormatELF() &&
(MF->getFunction().getCallingConv() == CallingConv::AArch64_VectorCall ||
MF->getFunction().getCallingConv() ==
- CallingConv::AArch64_SVE_VectorCall ||
- MF->getInfo<AArch64FunctionInfo>()->isSVECC())) {
+ CallingConv::AArch64_SVE_VectorCall)) {
auto *TS =
static_cast<AArch64TargetStreamer *>(OutStreamer->getTargetStreamer());
TS->emitDirectiveVariantPCS(CurrentFnSym);
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index 885f2a94f85f5..3d8d79de17517 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -339,10 +339,10 @@ static bool requiresSaveVG(const MachineFunction &MF);
// on the stack. This function is safe to be called before callee-saves or
// object offsets have been determined.
static bool isLikelyToHaveSVEStack(MachineFunction &MF) {
- auto *AFI = MF.getInfo<AArch64FunctionInfo>();
- if (AFI->isSVECC())
+ if (MF.getFunction().getCallingConv() == CallingConv::AArch64_SVE_VectorCall)
return true;
+ auto *AFI = MF.getInfo<AArch64FunctionInfo>();
if (AFI->hasCalculatedStackSizeSVE())
return bool(getSVEStackSize(MF));
@@ -3121,12 +3121,14 @@ static unsigned getPrologueDeath(MachineFunction &MF, unsigned Reg) {
static bool produceCompactUnwindFrame(MachineFunction &MF) {
const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
AttributeList Attrs = MF.getFunction().getAttributes();
- AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
return Subtarget.isTargetMachO() &&
!(Subtarget.getTargetLowering()->supportSwiftError() &&
Attrs.hasAttrSomewhere(Attribute::SwiftError)) &&
MF.getFunction().getCallingConv() != CallingConv::SwiftTail &&
- !requiresSaveVG(MF) && !AFI->isSVECC();
+ MF.getFunction().getCallingConv() !=
+ CallingConv::AArch64_SVE_VectorCall &&
+
+ !requiresSaveVG(MF);
}
static bool invalidateWindowsRegisterPairing(unsigned Reg1, unsigned Reg2,
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index bad7ccde2e1e0..a8565b899dcf0 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -24,6 +24,7 @@
#include "llvm/ADT/APInt.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/SmallVectorExtras.h"
@@ -7837,7 +7838,7 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &DL,
SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals) const {
MachineFunction &MF = DAG.getMachineFunction();
- const Function &F = MF.getFunction();
+ Function &F = MF.getFunction();
MachineFrameInfo &MFI = MF.getFrameInfo();
bool IsWin64 =
Subtarget->isCallingConvWin64(F.getCallingConv(), F.isVarArg());
@@ -7845,16 +7846,36 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
(isVarArg && Subtarget->isWindowsArm64EC());
AArch64FunctionInfo *FuncInfo = MF.getInfo<AArch64FunctionInfo>();
- SmallVector<ISD::OutputArg, 4> Outs;
- GetReturnInfo(CallConv, F.getReturnType(), F.getAttributes(), Outs,
- DAG.getTargetLoweringInfo(), MF.getDataLayout());
- if (any_of(Outs, [](ISD::OutputArg &Out){ return Out.VT.isScalableVector(); }))
- FuncInfo->setIsSVECC(true);
-
// Assign locations to all of the incoming arguments.
SmallVector<CCValAssign, 16> ArgLocs;
CCState CCInfo(CallConv, isVarArg, MF, ArgLocs, *DAG.getContext());
+ // This logic is consistent with AArch64TargetLowering::LowerCall.
+ // The `ShouldUpgradeToSVECC` flag can be set when analyzing arguments.
+ bool ShouldUpgradeToSVECC = false;
+ auto _ = make_scope_exit([&] {
+ if (CallConv != CallingConv::C && CallConv != CallingConv::Fast)
+ return;
+
+ if (!ShouldUpgradeToSVECC) {
+ // If the flag was not set, check if the return value requires the SVE CC.
+ SmallVector<ISD::OutputArg, 4> Outs;
+ GetReturnInfo(CallConv, F.getReturnType(), F.getAttributes(), Outs,
+ DAG.getTargetLoweringInfo(), MF.getDataLayout());
+ ShouldUpgradeToSVECC = any_of(
+ Outs, [](ISD::OutputArg &Out) { return Out.VT.isScalableVector(); });
+ }
+
+ if (!ShouldUpgradeToSVECC)
+ return;
+
+ if (isVarArg)
+ report_fatal_error("Passing/returning SVE types to variadic functions "
+ "is currently not supported");
+
+ F.setCallingConv(CallingConv::AArch64_SVE_VectorCall);
+ });
+
// At this point, Ins[].VT may already be promoted to i32. To correctly
// handle passing i8 as i8 instead of i32 on stack, we pass in both i32 and
// i8 to CC_AArch64_AAPCS with i32 being ValVT and i8 being LocVT.
@@ -7942,14 +7963,14 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
RC = &AArch64::FPR128RegClass;
else if (RegVT.isScalableVector() &&
RegVT.getVectorElementType() == MVT::i1) {
- FuncInfo->setIsSVECC(true);
RC = &AArch64::PPRRegClass;
+ ShouldUpgradeToSVECC = true;
} else if (RegVT == MVT::aarch64svcount) {
- FuncInfo->setIsSVECC(true);
RC = &AArch64::PPRRegClass;
+ ShouldUpgradeToSVECC = true;
} else if (RegVT.isScalableVector()) {
- FuncInfo->setIsSVECC(true);
RC = &AArch64::ZPRRegClass;
+ ShouldUpgradeToSVECC = true;
} else
llvm_unreachable("RegVT not supported by FORMAL_ARGUMENTS Lowering");
@@ -8597,14 +8618,6 @@ bool AArch64TargetLowering::isEligibleForTailCallOptimization(
CallAttrs.caller().hasStreamingBody())
return false;
- // Functions using the C or Fast calling convention that have an SVE signature
- // preserve more registers and should assume the SVE_VectorCall CC.
- // The check for matching callee-saved regs will determine whether it is
- // eligible for TCO.
- if ((CallerCC == CallingConv::C || CallerCC == CallingConv::Fast) &&
- MF.getInfo<AArch64FunctionInfo>()->isSVECC())
- CallerCC = CallingConv::AArch64_SVE_VectorCall;
-
bool CCMatch = CallerCC == CalleeCC;
// When using the Windows calling convention on a non-windows OS, we want
diff --git a/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h b/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
index 800787cc0b4f5..dd7b97619f938 100644
--- a/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
@@ -209,10 +209,6 @@ class AArch64FunctionInfo final : public MachineFunctionInfo {
bool IsMTETagged = false;
- /// The function has Scalable Vector or Scalable Predicate register argument
- /// or return type
- bool IsSVECC = false;
-
/// The frame-index for the TPIDR2 object used for lazy saves.
TPIDR2Object TPIDR2;
@@ -280,9 +276,6 @@ class AArch64FunctionInfo final : public MachineFunctionInfo {
int64_t getStreamingVGIdx() const { return StreamingVGIdx; };
void setStreamingVGIdx(unsigned FrameIdx) { StreamingVGIdx = FrameIdx; };
- bool isSVECC() const { return IsSVECC; };
- void setIsSVECC(bool s) { IsSVECC = s; };
-
TPIDR2Object &getTPIDR2Obj() { return TPIDR2; }
void initializeBaseYamlFields(const yaml::AArch64FunctionInfo &YamlMFI);
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
index 77dfab83a834a..81d9e02d548f1 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
@@ -104,8 +104,6 @@ AArch64RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
if (MF->getFunction().getCallingConv() ==
CallingConv::AArch64_SVE_VectorCall)
return CSR_Win_AArch64_SVE_AAPCS_SaveList;
- if (MF->getInfo<AArch64FunctionInfo>()->isSVECC())
- return CSR_Win_AArch64_SVE_AAPCS_SaveList;
return CSR_Win_AArch64_AAPCS_SaveList;
}
if (MF->getFunction().getCallingConv() == CallingConv::AArch64_VectorCall)
@@ -148,8 +146,6 @@ AArch64RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
// This is for OSes other than Windows; Windows is a separate case further
// above.
return CSR_AArch64_AAPCS_X18_SaveList;
- if (MF->getInfo<AArch64FunctionInfo>()->isSVECC())
- return CSR_AArch64_SVE_AAPCS_SaveList;
return CSR_AArch64_AAPCS_SaveList;
}
@@ -165,8 +161,7 @@ AArch64RegisterInfo::getDarwinCalleeSavedRegs(const MachineFunction *MF) const {
if (MF->getFunction().getCallingConv() == CallingConv::AArch64_VectorCall)
return CSR_Darwin_AArch64_AAVPCS_SaveList;
if (MF->getFunction().getCallingConv() == CallingConv::AArch64_SVE_VectorCall)
- report_fatal_error(
- "Calling convention SVE_VectorCall is unsupported on Darwin.");
+ return CSR_Darwin_AArch64_SVE_AAPCS_SaveList;
if (MF->getFunction().getCallingConv() ==
CallingConv::AArch64_SME_ABI_Support_Routines_PreserveMost_From_X0)
report_fatal_error(
@@ -205,8 +200,6 @@ AArch64RegisterInfo::getDarwinCalleeSavedRegs(const MachineFunction *MF) const {
return CSR_Darwin_AArch64_RT_AllRegs_SaveList;
if (MF->getFunction().getCallingConv() == CallingConv::Win64)
return CSR_Darwin_AArch64_AAPCS_Win64_SaveList;
- if (MF->getInfo<AArch64FunctionInfo>()->isSVECC())
- return CSR_Darwin_AArch64_SVE_AAPCS_SaveList;
return CSR_Darwin_AArch64_AAPCS_SaveList;
}
diff --git a/llvm/test/CodeGen/AArch64/arm64-darwin-cc.ll b/llvm/test/CodeGen/AArch64/arm64-darwin-cc.ll
index 9e18f778d1c46..9ec4bea520c53 100644
--- a/llvm/test/CodeGen/AArch64/arm64-darwin-cc.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-darwin-cc.ll
@@ -1,9 +1,7 @@
; RUN: sed -e "s,CC,cfguard_checkcc,g" %s | not --crash llc -mtriple=arm64-apple-darwin -o - 2>&1 | FileCheck %s --check-prefix=CFGUARD
-; RUN: sed -e "s,CC,aarch64_sve_vector_pcs,g" %s | not --crash llc -mtriple=arm64-apple-darwin -o - 2>&1 | FileCheck %s --check-prefix=SVE_VECTOR_PCS
define CC void @f0() {
unreachable
}
; CFGUARD: Calling convention CFGuard_Check is unsupported on Darwin.
-; SVE_VECTOR_PCS: Calling convention SVE_VectorCall is unsupported on Darwin.
diff --git a/llvm/test/CodeGen/AArch64/win-sve.ll b/llvm/test/CodeGen/AArch64/win-sve.ll
index 53ac9344175a3..6bf8e58b9de29 100644
--- a/llvm/test/CodeGen/AArch64/win-sve.ll
+++ b/llvm/test/CodeGen/AArch64/win-sve.ll
@@ -784,8 +784,6 @@ define void @f6(<vscale x 2 x i64> %x, [8 x i64] %pad, i64 %n9) personality ptr
; CHECK-NEXT: .seh_proc f6
; CHECK-NEXT: .seh_handler __CxxFrameHandler3, @unwind, @except
; CHECK-NEXT: // %bb.0:
-; CHECK-NEXT: sub sp, sp, #16
-; CHECK-NEXT: .seh_stackalloc 16
; CHECK-NEXT: addvl sp, sp, #-18
; CHECK-NEXT: .seh_allocz 18
; CHECK-NEXT: str p4, [sp] // 2-byte Folded Spill
@@ -853,21 +851,21 @@ define void @f6(<vscale x 2 x i64> %x, [8 x i64] %pad, i64 %n9) personality ptr
; CHECK-NEXT: add x29, sp, #16
; CHECK-NEXT: .seh_add_fp 16
; CHECK-NEXT: .seh_endprologue
-; CHECK-NEXT: sub sp, sp, #64
+; CHECK-NEXT: sub sp, sp, #80
; CHECK-NEXT: mov x0, #-2 // =0xfffffffffffffffe
; CHECK-NEXT: addvl x8, x29, #18
; CHECK-NEXT: mov x19, sp
-; CHECK-NEXT: stur x0, [x8, #16]
+; CHECK-NEXT: stur x0, [x8]
; CHECK-NEXT: addvl x8, x29, #18
-; CHECK-NEXT: ldr x1, [x8, #32]
-; CHECK-NEXT: .Ltmp0:
+; CHECK-NEXT: ldr x1, [x8, #16]
+; CHECK-NEXT: .Ltmp0: // EH_LABEL
; CHECK-NEXT: add x0, x19, #0
; CHECK-NEXT: bl g6
-; CHECK-NEXT: .Ltmp1:
+; CHECK-NEXT: .Ltmp1: // EH_LABEL
; CHECK-NEXT: // %bb.1: // %invoke.cont
; CHECK-NEXT: .seh_startepilogue
-; CHECK-NEXT: add sp, sp, #64
-; CHECK-NEXT: .seh_stackalloc 64
+; CHECK-NEXT: add sp, sp, #80
+; CHECK-NEXT: .seh_stackalloc 80
; CHECK-NEXT: ldp x29, x30, [sp, #16] // 16-byte Folded Reload
; CHECK-NEXT: .seh_save_fplr 16
; CHECK-NEXT: ldr x28, [sp, #8] // 8-byte Folded Reload
@@ -932,12 +930,8 @@ define void @f6(<vscale x 2 x i64> %x, [8 x i64] %pad, i64 %n9) personality ptr
; CHECK-NEXT: .seh_save_preg p14, 10
; CHECK-NEXT: ldr p15, [sp, #11, mul vl] // 2-byte Folded Reload
; CHECK-NEXT: .seh_save_preg p15, 11
-; CHECK-NEXT: add sp, sp, #16
-; CHECK-NEXT: .seh_stackalloc 16
; CHECK-NEXT: addvl sp, sp, #18
; CHECK-NEXT: .seh_allocz 18
-; CHECK-NEXT: add sp, sp, #16
-; CHECK-NEXT: .seh_stackalloc 16
; CHECK-NEXT: .seh_endepilogue
; CHECK-NEXT: ret
; CHECK-NEXT: .seh_endfunclet
@@ -1160,64 +1154,6 @@ define void @f8(<vscale x 2 x i64> %v) {
ret void
}
-define void @f9(<vscale x 2 x i64> %v, ...) {
-; CHECK-LABEL: f9:
-; CHECK: .seh_proc f9
-; CHECK-NEXT: // %bb.0:
-; CHECK-NEXT: sub sp, sp, #64
-; CHECK-NEXT: .seh_stackalloc 64
-; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: .seh_allocz 1
-; CHECK-NEXT: str z8, [sp] // 16-byte Folded Spill
-; CHECK-NEXT: .seh_save_zreg z8, 0
-; CHECK-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT: .seh_save_reg_x x30, 16
-; CHECK-NEXT: .seh_endprologue
-; CHECK-NEXT: addvl x8, sp, #1
-; CHECK-NEXT: add x9, sp, #8
-; CHECK-NEXT: str x2, [x8, #32]
-; CHECK-NEXT: addvl x8, sp, #1
-; CHECK-NEXT: str x0, [x8, #16]
-; CHECK-NEXT: addvl x8, sp, #1
-; CHECK-NEXT: str x1, [x8, #24]
-; CHECK-NEXT: addvl x8, sp, #1
-; CHECK-NEXT: str x3, [x8, #40]
-; CHECK-NEXT: addvl x8, sp, #1
-; CHECK-NEXT: str x4, [x8, #48]
-; CHECK-NEXT: addvl x8, sp, #1
-; CHECK-NEXT: str x5, [x8, #56]
-; CHECK-NEXT: addvl x8, sp, #1
-; CHECK-NEXT: str x6, [x8, #64]
-; CHECK-NEXT: addvl x8, sp, #1
-; CHECK-NEXT: str x7, [x8, #72]
-; CHECK-NEXT: add x8, sp, #16
-; CHECK-NEXT: addvl x8, x8, #1
-; CHECK-NEXT: str x8, [sp, #8]
-; CHECK-NEXT: //APP
-; CHECK-NEXT: //NO_APP
-; CHECK-NEXT: .seh_startepilogue
-; CHECK-NEXT: ldr x30, [sp] // 8-byte Folded Reload
-; CHECK-NEXT: .seh_save_reg x30, 0
-; CHECK-NEXT: add sp, sp, #16
-; CHECK-NEXT: .seh_stackalloc 16
-; CHECK-NEXT: ldr z8, [sp] // 16-byte Folded Reload
-; CHECK-NEXT: .seh_save_zreg z8, 0
-; CHECK-NEXT: add sp, sp, #64
-; CHECK-NEXT: .seh_stackalloc 64
-; CHECK-NEXT: addvl sp, sp, #1
-; CHECK-NEXT: .seh_allocz 1
-; CHECK-NEXT: add sp, sp, #64
-; CHECK-NEXT: .seh_stackalloc 64
-; CHECK-NEXT: .seh_endepilogue
-; CHECK-NEXT: ret
-; CHECK-NEXT: .seh_endfunclet
-; CHECK-NEXT: .seh_endproc
- %va_list = alloca ptr
- call void @llvm.va_start.p0(ptr %va_list)
- call void asm "", "r,~{d8},~{memory}"(ptr %va_list)
- ret void
-}
-
declare void @g10(ptr,ptr)
define void @f10(i64 %n, <vscale x 2 x i64> %x) "frame-pointer"="all" {
; CHECK-LABEL: f10:
@@ -1546,40 +1482,33 @@ define tailcc void @f15(double %d, <vscale x 4 x i32> %vs, [9 x i64], i32 %i) {
; CHECK-LABEL: f15:
; CHECK: .seh_proc f15
; CHECK-NEXT: // %bb.0:
-; CHECK-NEXT: addvl sp, sp, #-1
-; CHECK-NEXT: .seh_allocz 1
-; CHECK-NEXT: str z8, [sp] // 16-byte Folded Spill
-; CHECK-NEXT: .seh_save_zreg z8, 0
-; CHECK-NEXT: str x28, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT: .seh_save_reg_x x28, 16
+; CHECK-NEXT: str x28, [sp, #-32]! // 8-byte Folded Spill
+; CHECK-NEXT: .seh_save_reg_x x28, 32
; CHECK-NEXT: str x30, [sp, #8] // 8-byte Folded Spill
; CHECK-NEXT: .seh_save_reg x30, 8
-; CHECK-NEXT: sub sp, sp, #16
-; CHECK-NEXT: .seh_stackalloc 16
+; CHECK-NEXT: str d8, [sp, #16] // 8-byte Folded Spill
+; CHECK-NEXT: .seh_save_freg d8, 16
; CHECK-NEXT: addvl sp, sp, #-1
; CHECK-NEXT: .seh_allocz 1
; CHECK-NEXT: .seh_endprologue
-; CHECK-NEXT: addvl x8, sp, #2
+; CHECK-NEXT: addvl x8, sp, #1
+; CHECK-NEXT: addvl x9, sp, #1
; CHECK-NEXT: //APP
; CHECK-NEXT: //NO_APP
-; CHECK-NEXT: stp d0, d0, [sp, #8]
; CHECK-NEXT: ldr w8, [x8, #104]
-; CHECK-NEXT: str w8, [sp, #8]
+; CHECK-NEXT: str d0, [x9, #24]
+; CHECK-NEXT: addvl x9, sp, #1
+; CHECK-NEXT: str d0, [sp]
+; CHECK-NEXT: str w8, [x9, #24]
; CHECK-NEXT: .seh_startepilogue
; CHECK-NEXT: addvl sp, sp, #1
; CHECK-NEXT: .seh_allocz 1
-; CHECK-NEXT: add sp, sp, #16
-; CHECK-NEXT: .seh_stackalloc 16
+; CHECK-NEXT: ldr d8, [sp, #16] // 8-byte Folded Reload
+; CHECK-NEXT: .seh_save_freg d8, 16
; CHECK-NEXT: ldr x30, [sp, #8] // 8-byte Folded Reload
; CHECK-NEXT: .seh_save_reg x30, 8
-; CHECK-NEXT: ldr x28, [sp] // 8-byte Folded Reload
-; CHECK-NEXT: .seh_save_reg x28, 0
-; CHECK-NEXT: add sp, sp, #16
-; CHECK-NEXT: .seh_stackalloc 16
-; CHECK-NEXT: ldr z8, [sp] // 16-byte Folded Reload
-; CHECK-NEXT: .seh_save_zreg z8, 0
-; CHECK-NEXT: addvl sp, sp, #1
-; CHECK-NEXT: .seh_allocz 1
+; CHECK-NEXT: ldr x28, [sp], #32 // 8-byte Folded Reload
+; CHECK-NEXT: .seh_save_reg_x x28, 32
; CHECK-NEXT: add sp, sp, #80
; CHECK-NEXT: .seh_stackalloc 80
; CHECK-NEXT: .seh_endepilogue
@@ -784,8 +784,6 @@ define void @f6(<vscale x 2 x i64> %x, [8 x i64] %pad, i64 %n9) personality ptr
; CHECK-NEXT: .seh_proc f6
; CHECK-NEXT: .seh_handler __CxxFrameHandler3, @unwind, @except
; CHECK-NEXT: // %bb.0:
; CHECK-NEXT: sub sp, sp, #16
I don't know enough about SEH to tell if this change is significant (to me, it seems insignificant). I also don't know if this case is fully supported.
@@ -1546,40 +1482,33 @@ define tailcc void @f15(double %d, <vscale x 4 x i32> %vs, [9 x i64], i32 %i) {
; CHECK-LABEL: f15:
; CHECK: .seh_proc f15
; CHECK-NEXT: // %bb.0:
; CHECK-NEXT: addvl sp, sp, #-1
This change is due to the explicit `tailcc` on the function. I'm not 100% sure on the correct behaviour here; `LowerCall` only uses the SVE CC if the original CC is `C` or `Fast` (which the new logic matches). https://reviews.llvm.org/D84869 suggests this is not a safe CC for SVE.
The LangRef says that the calling convention requires the prototype of all callees to exactly match the prototype of the function definition. So I guess adding `tailcall` on this function is not correct.
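For reference, the LangRef constraint being cited, that `tailcc` requires callee prototypes to exactly match the caller's, would be satisfied by something like the following illustrative IR (not taken from this patch; the function names are made up):

```llvm
; Both caller and callee use tailcc and agree exactly on the signature,
; as the LangRef requires for guaranteed tail calls.
declare tailcc void @callee(double, <vscale x 4 x i32>)

define tailcc void @caller(double %d, <vscale x 4 x i32> %v) {
  musttail call tailcc void @callee(double %d, <vscale x 4 x i32> %v)
  ret void
}
```

The `f15` test above does not have a matching `tailcc` callee, which is what makes its use of the convention questionable.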
!requiresSaveVG(MF) && !AFI->isSVECC();
MF.getFunction().getCallingConv() !=
    CallingConv::AArch64_SVE_VectorCall &&
nit: whitespace
if (isVarArg)
  report_fatal_error("Passing/returning SVE types to variadic functions "
                     "is currently not supported");
You can pass them to variadic functions, but not as variadic arguments.
Ah, I was confused by the wording of our current errors, which say "Passing SVE types to variadic functions currently not supported" (which, it appears, actually checks for use as a variadic argument).
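The distinction under discussion, a fixed SVE parameter of a variadic function versus an SVE value passed as one of the variadic arguments, can be shown in illustrative IR (not taken from this patch; the function names are made up):

```llvm
; Fine in principle: the scalable vector is a *fixed* parameter of a
; variadic function.
declare void @fixed_sve_param(<vscale x 2 x i64>, ...)

; Not supported: the scalable vector is passed as a *variadic* argument.
declare void @takes_varargs(i64, ...)

define void @bad(<vscale x 2 x i64> %v) {
  call void (i64, ...) @takes_varargs(i64 0, <vscale x 2 x i64> %v)
  ret void
}
```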
Going to close this PR. While I think this is currently valid, I think the main confusing thing about
I just wanted to add some extra commentary in case somebody comes looking: it's a nuisance, but there is no dedicated SVE calling convention. What we did when adding SVE support was to modify the existing calling convention in a compatible way, to document the requirement for handling state that was previously not available. This generally applies to all the calling conventions; for example, preserve_all should preserve SVE state if in use. I don't want to get into a situation where we need SVE versions of all the existing calling conventions.

That said, I'm not going to fully defend the implementation, as we certainly made use of "calling convention" mechanics in order to achieve the desired behaviour. If anything, I'd rather generalise the logic and only use the "SVE calling convention" for functions marked with